Skip to content
This repository has been archived by the owner on May 9, 2024. It is now read-only.

Add execution mode option to PyHDK python tests #518

Merged
merged 5 commits into from
Jun 28, 2023
Merged

Conversation

kurapov-peter
Copy link
Contributor

The PR introduces a --device option to pytests to allow for switching between execution modes. The default behavior remains unchanged.
It also provides a quick simple fix (by disabling L0 compilation context caching with enable-gpu-code-compilation-cache=false) to the following problem:
When HDK components are handled manually and the data manager is reinitialized the compilation cache may outlive the GPU manager (since its lifetime is bound to the data manager's lifetime). The compilation context would then try to access an invalid instance L0Device with a valid device handle (the driver returns the same handle on consequent calls).

@ienkovich
Copy link
Contributor

Is enable-gpu-code-compilation-cache=false a temporary solution?

Also, can we choose a device via pyhdk.init() in test suite setup or in another similar way? It feels wrong that users have to choose an execution device in each run call.

@kurapov-peter
Copy link
Contributor Author

Is enable-gpu-code-compilation-cache=false a temporary solution?

Yep. The data manager could reset the cache when it's destroyed, but that would create a cycle dependency with executor now. Flushing the cache on executor destruction also doesn't seem right. I think the right thing to do is get rid of the device pointer in the compilation context. There's a couple questions for such a change though that I'll need to address (for example: if we only use the device id to route execution what happens if the new executor has less devices available).

Also, can we choose a device via pyhdk.init() in test suite setup or in another similar way? It feels wrong that users have to choose an execution device in each run call.

We could, but why would you? I think in the end we will have a default setting that allows any possible offloading scheme. If there's any problem with a particular query (i.e. the cost model is very wrong on the device preference), you can force it to run with a specific setting.

We also do have the cpu-only setting for pyhdk.init already, so this is not how a user would use the API. We have these options present here only to simplify testing and make it explicit.

@ienkovich
Copy link
Contributor

Yep. The data manager could reset the cache when it's destroyed, but that would create a cycle dependency with executor now.

You can do it by adding a new GpuCodeCache interface that Executor would implement and then clear the executor's cache through this interface to avoid the cycle dep. We've used a similar trick with Executor before.

We also do have the cpu-only setting for pyhdk.init already, so this is not how a user would use the API. We have these options present here only to simplify testing and make it explicit.

I don't feel the testing is more simple if I have to add an additional arg to each test function, each sql call, and each run call. It is just a global testing option, why not set it once somewhere?

@kurapov-peter
Copy link
Contributor Author

You can do it by adding a new GpuCodeCache interface that Executor would implement and then clear the executor's cache through this interface to avoid the cycle dep. We've used a similar trick with Executor before.

#520

I don't feel the testing is more simple if I have to add an additional arg to each test function, each sql call, and each run call. It is just a global testing option, why not set it once somewhere?

It's not "each test function", only those that need the option. It is also not a global testing option. It may seem so now because we currently allow CPU fallbacks. Simplicity should come with explicitness. I'd like to forbid fallbacks in tests later on so that we are sure a query runs on the device we asked for. Some of those we will want to fall back. Having a global setting seems inconvenient for that.

I agree that it would be nice to type less. I initially added the cpu-only as a "parameterize" clause for that very reason. Yet setting cpu-only globally changes data manager behavior (we don't create a gpu manager at all), so it's less flexible and doesn't cover the "have-gpu-manger-but-run-on-cpu" case. It also wouldn't allow running both cases in the same env -- you'd need a reset as we don't expose the config.

There're a couple of more-or-less straightforward ways to set the device type once of course. Is it worth it though?

@alexbaden alexbaden merged commit f421fae into main Jun 28, 2023
@alexbaden alexbaden deleted the pakurapo/pytest-l0 branch June 28, 2023 02:20
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants